NVIDIA Nemotron-H-8B-Base-8K is a large language model (LLM) developed by NVIDIA that generates a completion for a given piece of text. The model uses a hybrid architecture composed primarily of Mamba-2 and MLP layers, with only four attention layers. It supports a context length of 8K tokens and covers multiple languages, including English, German, Spanish, French, Italian, Korean, Portuguese, Russian, Japanese, and Chinese.
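Because the model's context window is 8K (8,192) tokens, prompts longer than that must be trimmed before inference. A minimal sketch of one common strategy, keeping the most recent tokens and reserving room for generated output (the helper name, the 256-token reservation, and the tail-keeping policy are illustrative assumptions, not part of the model's API):

```python
def truncate_to_context(token_ids, max_context=8192, reserve_for_output=256):
    """Trim a prompt so prompt + generated tokens fit in the 8K window.

    Keeps the tail of the prompt: for a base completion model, the most
    recent tokens usually matter most for the continuation.
    """
    budget = max_context - reserve_for_output
    if len(token_ids) <= budget:
        return token_ids
    return token_ids[-budget:]

# Stand-in for real tokenizer output (e.g. from a Hugging Face tokenizer).
ids = list(range(10_000))
trimmed = truncate_to_context(ids)
print(len(trimmed))  # → 7936 (8192 - 256)
```

In practice the token IDs would come from the model's own tokenizer, and more sophisticated schemes (e.g. keeping both the head and tail of a document) may suit some tasks better.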